Generalization in a Large Committee Machine

Authors

  • H Schwarze
  • J Hertz
Abstract

We study generalization in a committee machine with non-overlapping receptive fields trained to implement a function of the same structure. Using the replica method, we calculate the generalization error in the limit of a large number of hidden units. For continuous weights the generalization error falls off asymptotically inversely proportional to α, the number of training examples per weight. For binary weights we find a discontinuous transition from poor to perfect generalization, followed by a wide region of metastability. Broken replica symmetry is found within this region at low temperatures. The first-order transition occurs at a lower value of α, and the metastability limit at a higher value, than in the simple perceptron.

A good part of the utility of neural networks lies in their ability to learn a function from examples. There has therefore been a good deal of theoretical work on calculating the generalization ability of different networks, i.e. their ability to give the correct output for an input not used in training. Most of these calculations have dealt with single-layer nets. Extensions to networks with a hidden layer include a model with small hidden receptive fields [1], some general results on networks whose outputs are continuous functions of their inputs [2, 3], and a calculation for a so-called committee machine [4] learning a function which could be implemented by a simple perceptron (i.e. one with no hidden units) in the high-temperature (i.e. high-noise) limit [5]. In this letter we make a step toward a more general understanding of nets with a single hidden layer by solving the 'tree' version of the committee machine in the limit of a large number of committee members (hidden units). In the present problem, the function to be learned is matched to the machine, i.e. the 'teacher' and the 'student' are both tree committee machines. For networks with a single output and binary weights, the committee machine may already be regarded as the most general two-layer machine, since any combination of hidden-output weights can then be gauged to +1 by flipping the signs of all input-hidden weights to units which lead on to the output through negative connections. In the restricted ...
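To make the architecture concrete, the following is a minimal sketch (not the authors' code) of the tree committee machine described above: K hidden units, each with its own non-overlapping receptive field of N inputs, hidden-to-output couplings gauged to +1, and a Monte Carlo estimate of the generalization error as the probability that a student disagrees with a teacher of the same structure on a random input. The function names, the choice of Gaussian inputs, and the sizes K and N are illustrative assumptions.

```python
# Minimal sketch of a tree committee machine (teacher-student setting).
# Assumptions for illustration: Gaussian inputs, binary +/-1 weights,
# hidden-to-output couplings fixed to +1 (the gauge described in the text).
import numpy as np

def committee_output(weights, x):
    """Output of a tree committee machine.

    weights: array of shape (K, N), one weight vector per receptive field.
    x:       array of shape (K, N), the input split into K disjoint fields.
    Returns +1 or -1, the sign of the majority vote of the hidden units.
    """
    hidden = np.sign(np.einsum("kn,kn->k", weights, x))  # K hidden-unit outputs
    return np.sign(hidden.sum())

def generalization_error(teacher, student, n_samples=20000, seed=None):
    """Monte Carlo estimate of the generalization error: the probability
    that teacher and student disagree on a random Gaussian input."""
    rng = np.random.default_rng(seed)
    K, N = teacher.shape
    disagreements = 0
    for _ in range(n_samples):
        x = rng.standard_normal((K, N))
        if committee_output(teacher, x) != committee_output(student, x):
            disagreements += 1
    return disagreements / n_samples

if __name__ == "__main__":
    K, N = 15, 50                                       # illustrative sizes
    rng = np.random.default_rng(0)
    teacher = np.sign(rng.standard_normal((K, N)))      # binary teacher weights
    student = np.sign(rng.standard_normal((K, N)))      # untrained binary student
    print("generalization error of a random student:",
          generalization_error(teacher, student, seed=1))
```

A completely random student disagrees with the teacher on roughly half of the inputs; the quantity estimated here is the generalization error whose dependence on α the letter analyzes (decaying as 1/α for continuous weights, and dropping discontinuously to zero for binary weights).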


Similar articles

Statistical Mechanics of Learning in a Large Committee Machine

We use statistical mechanics to study generalization in large committee machines. For an architecture with nonoverlapping receptive fields a replica calculation yields the generalization error in the limit of a large number of hidden units. For continuous weights the generalization error falls off asymptotically inversely proportional to Q, the number of training examples per weight. For b...


Universal Asymptotics in Committee Machines with Tree Architecture

On-line supervised learning in the general K Tree Committee Machine (TCM) is studied for a uniform distribution of inputs. Examples are corrupted by noise in the teacher output. From the differential equations which describe the learning dynamics, the algorithm which optimizes the generalization ability is exactly obtained. For a large number of presented examples, the asymptotic behaviour of ...


Discontinuous Generalization in Large Committee Machines

The problem of learning from examples in multilayer networks is studied within the framework of statistical mechanics. Using the replica formalism we calculate the average generalization error of a fully connected committee machine in the limit of a large number of hidden units. If the number of training examples is proportional to the n...


Multilayer Perceptrons May Learn Simple Rules Quickly

Zero temperature Gibbs learning is considered for a connected committee machine with K hidden units. For large K, the scale of the learning curve strongly depends on the target rule. When learning a perceptron, the sample size P needed for optimal generalization scales so that N ≪ P ≪ KN, where N is the dimension of the input. This even holds for a noisy perceptron rule if a new input is classified ...


Learning and Generalization Theories of Large Committee Machines (arXiv:cond-mat/9601122v1, 25 Jan 1996)

The study of the distribution of volumes associated to the internal representations of learning examples allows us to derive the critical learning capacity (α_c = (16/π)√(ln K)) of large committee machines, to verify the stability of the solution in the limit of a large number K of hidden units, and to find a Bayesian generalization crossover at α = K.





Publication date: 1992